ON EXISTENCE OF A CHANGE IN MEAN OF FUNCTIONAL DATA


Buddhananda Banerjee and Satyaki Mazumder
Department of Mathematics & Statistics
Indian Institute of Science Education and Research, Kolkata
buddha.banerjee@iiserkol.ac.in, satyaki@iiserkol.ac.in

Abstract. Functional data often arise as sequential temporal observations over a continuous
state-space. A set of functional data with a possible change in its structure may lead
to a wrong conclusion if the change is not taken into account. It is therefore sometimes crucial
to know whether a change point exists in a given sequence of functional data before doing any
further statistical inference. We develop a new methodology to test for a change in the mean
function of the data. To obtain the test statistic we provide an alternative estimator of the
covariance kernel. The proposed estimator is asymptotically unbiased under the null hypothesis
and, at the same time, has a smaller bias than the existing estimator. We show that under the
null hypothesis the proposed test statistic is asymptotically pivotal. Moreover, it is shown
that under the alternative hypothesis the test is consistent as the sample size grows. It is
also found that the proposed test is more powerful than the test procedure available in the
literature. In extensive simulation studies the proposed test outperforms the existing one by
a wide margin in power for moderate sample sizes. The developed methodology performs
satisfactorily for the average daily temperature of central England and for the monthly
global average temperature anomalies.

Keywords: Change point detection, functional data analysis, covariance kernel. 


1. Introduction 


Functional data analysis (FDA) is becoming increasingly popular because of its wide
applicability in various fields of statistics. Representing real-life observations as functions
is often more natural and more appealing than a finite-dimensional representation, and at
the same time it is often noticed that FDA leads to more accurate inference in this regard.
Ramsay and Silverman (2005) have enriched the literature with detailed discussions of
several techniques and of the usefulness of FDA. Some recent developments in many more aspects
of FDA can be found in Ferraty (2011). However, the inference, and especially the prediction,
may alter if there exists an inherent change in the stochastic structure of the functional
data observed temporally. The change may occur at an unknown point of time within the
chronological sequence of data, and it is always challenging to test statistically whether the
change has occurred or not. For scalar and vector data a considerable amount of
contributions can be found in the works by Cobb (197.), Inclan and Tiao (1994),
Davis et al. (1995), Antoch et al., Horvath et al. (1999), Kokoszka and Leipus (2000),
Kirch et al. (2014) and references therein, among many others. In the context of functional
data a change may occur in the mean function, in the covariance kernel of the data, or in both.
This paper sheds light on the change in the mean function in particular.
Recently, Berkes et al. (2009) and Aue et al. (2009) have proposed methods for detecting
changes in the mean functions of an observed set of functional data. Berkes et al. (2009), in
their pioneering work in this context, have provided an elegant test procedure to decide the
existence of a significant change in the mean function, whereas Aue et al. (2009),
following the method of Berkes et al. (2009), have dealt with the detection of the position of
the change in the mean function. In practice both tasks are equally important: to judge whether
there is a change in the mean function of the data and, if there is a significant change,
to detect its location. For example, while analyzing the temperature of a certain
region over a long period of time, it is very important for environmentalists to identify the
time point after which a significant change in the mean temperature is observed as a possible
effect of global warming. In this paper we come up with a different methodology to analyze
functional data subject to a possible change point and propose a new statistical test,
which is more powerful than the existing one(s), for detecting the presence of a change in
the mean function of the data. Here we show that under the null hypothesis, i.e. with no
change in the data, the proposed test statistic converges in distribution to a functional of
Brownian bridges, as shown in Berkes et al. (2009). Moreover, we prove that the test is
consistent under the alternative hypothesis as the number of observations becomes large.
We provide an estimator of the covariance kernel which is not only consistent under the null
hypothesis but also has less asymptotic bias under the alternative hypothesis than the
estimator provided by Berkes et al. (2009) or Aue et al. (2009). Because of the reduction in
the asymptotic bias while estimating the covariance kernel, the proposed test has better power
than the existing method of Berkes et al. (2009). The outcomes of an extensive simulation study
reflect the same. It is also noted that our method outperforms the existing method by a wide
margin for small samples. Therefore, it is more advantageous to use the proposed method in
practice for deciding on the presence of a significant change in the mean of functional data,
especially when the data size is not big enough.

Corresponding author: buddha.banerjee@iiserkol.ac.in.

The organization of the paper is as follows. In Section 2, we introduce the required
notation and definitions. The details of the model discussed in the paper are also described
in that section. Section 3 deals with the testing methodology and the main results of the
paper. There we provide the theorems about the consistency of the proposed estimator of the
covariance kernel, the asymptotic null distribution and the asymptotic consistency of the test
procedure. In Section 4 simulation results are provided in detail, where we show that our
method substantially improves over the existing method in terms of power of the test. In
Section 5 we show the performance of our test on real data. Remarks and conclusions are given
in Section 6. Finally, we provide the required proofs of the results of Section 3 in the
Appendix (Section 7).


2. Preliminaries and assumptions 

Let $X_i(t)$, for $i = 1, \ldots, N$, be Hilbert-valued random functions defined over a compact
set $\tau = [0,1]$. We assume that the $X_i$ are independent. We are interested in checking the
equality of the mean functions of $X_i$ for all $i = 1, 2, \ldots, N$. More precisely, the null
hypothesis to test is

\[ H_0 : E(X_1(t)) = E(X_2(t)) = \cdots = E(X_N(t)). \]

It is important to note that nothing is presumed about any property of the common mean
under the null hypothesis.

Under the alternative hypothesis we assume that the null hypothesis $H_0$ does not hold. We
deal with the situation in which the data contain at most one change point; however, for
applications we elaborate how to implement this method in the case of multiple change points.
In particular, in Section 5 we specifically deal with the situation with more than one change
point. There the data can be subdivided into several consecutive parts such that within each
part the mean function remains constant but it differs between contiguous parts.
The details of the model with a single change point are discussed in Subsection 2.1.
Under the null hypothesis we express $X_i$, $i = 1, \ldots, N$, in the following manner:

\[ X_i(t) = \mu(t) + Y_i(t), \qquad E(Y_i(t)) = 0. \tag{2.1} \]

Now we specify the assumptions about the mean function $\mu$ and the random element $Y_i$, based
on which the asymptotic behaviour of the test statistic can be determined. From here onwards
all integrations are computed over the compact set $\tau$, unless otherwise mentioned.

2.1. Assumptions. 

A1. The mean function is square integrable, that is, $\mu \in L^2(\tau)$, and the unobservable
random components $Y_i$ are independent and identically distributed random elements
in $L^2(\tau)$ with

\[ E(Y_i(t)) = 0 \quad \forall t \in \tau, \]

for $i = 1, \ldots, N$, and

\[ E\|Y_i\|^2 = \int E(Y_i^2(t))\,dt < \infty. \tag{2.2} \]

The covariance kernel is defined as

\[ c(t,s) = E(Y_i(t) Y_i(s)), \quad t, s \in \tau, \tag{2.3} \]

with the assumption that $c(t,s) \in L^2(\tau \times \tau)$. Assumption A1 implies that the
covariance operator of $Y$, which is a positive definite symmetric Hilbert-Schmidt (H-S)
operator mapping from $L^2(\tau)$ to itself, is of the form

\[ C(x) = E[\langle Y, x \rangle Y]. \tag{2.4} \]

The evaluation of $C(x)$ at $t$, i.e., $C(x)(t)$, is given by

\[ C(x)(t) = \int c(t,s) x(s)\,ds \quad \forall t \in \tau. \]

Moreover, Mercer's theorem (Indritz, 1963, Chapter 4) implies that $c(t,s)$ has the
following spectral decomposition:

\[ c(t,s) = \sum_{l=1}^{\infty} \lambda^l v^l(t) v^l(s), \quad t, s \in \tau, \tag{2.5} \]

where each real scalar $\lambda^l$ and function $v^l$ (in $L^2(\tau)$) are defined, for
$t \in \tau$, as

\[ C(v^l)(t) = \lambda^l v^l(t), \quad l = 1, 2, \ldots, \]
\[ \text{i.e.,} \quad \int c(t,s) v^l(s)\,ds = \lambda^l v^l(t), \quad l = 1, 2, \ldots. \tag{2.6} \]

Here and below the superscript $l$ is an index, not a power. In other words, the $\lambda^l$
and $v^l$ are the eigenvalues and the corresponding eigenfunctions, respectively, of the
operator $C(\cdot)$. Since the eigenfunctions of the positive definite symmetric operator
$C(\cdot)$ form a complete orthonormal basis of $L^2(\tau)$ and the eigenvalues are positive,
the Karhunen-Loeve representation of $Y_i$ holds in $L^2(\tau)$ and is given by

\[ Y_i(t) = \sum_{l=1}^{\infty} \sqrt{\lambda^l}\, \xi_i^l\, v^l(t), \tag{2.7} \]

where $\sqrt{\lambda^l}\, \xi_i^l = \langle Y_i, v^l \rangle = \int Y_i(s) v^l(s)\,ds$ is known
as the $l$th functional principal component score. By construction, the elements of the
sequence $\{\xi_i^l\}_l$ are uncorrelated random variables with zero mean and unit variance,
and $\{\xi_i^l\}_l$ and $\{\xi_j^l\}_l$ are independent for $i \neq j$.
A2. There exists some positive integer $d$ such that the eigenvalues $\lambda^l$ satisfy

\[ \lambda^1 > \lambda^2 > \cdots > \lambda^d > \lambda^{d+1}. \]

A3. $Y_i$, $i = 1, \ldots, N$, satisfy

\[ E\|Y_i\|^4 = E\left[ \int Y_i^2(t)\,dt \right]^2 < \infty. \]


A4. Under the alternative, with a single change point, the observations $X_i$,
$i = 1, \ldots, N$, can be represented as follows:

\[ X_i(t) = \begin{cases} \mu_1(t) + Y_i(t), & 1 \le i \le k^*, \\ \mu_2(t) + Y_i(t), & k^* < i \le N, \end{cases} \tag{2.8} \]

where $Y_i$, $i = 1, \ldots, N$, satisfy the assumption A1, $\mu_j(t)$, $j = 1, 2$, are in
$L^2(\tau)$ and $k^* = [N\theta]$, with $\theta \in (0,1)$. Therefore, we assume that under the
alternative hypothesis of a single change point a change may occur in the mean function but
the covariance kernel remains the same before and after the change in the data. Keeping this
in consideration we estimate the covariance kernel in the following section and develop
a new methodology to test $H_0$.




3. Methodology and Main results 

To estimate the covariance kernel let us define the piecewise sample means for two segments

\[ \hat\mu_k(t) = \frac{1}{k} \sum_{i=1}^{k} X_i(t), \tag{3.1} \]

\[ \tilde\mu_k(t) = \frac{1}{N-k} \sum_{i=k+1}^{N} X_i(t), \tag{3.2} \]

where $k = [Nu]$ with $u \in (0,1)$, implying $1 \le k < N$. For $u = 1$ we define
$\hat\mu_N(t) = \frac{1}{N} \sum_{i=1}^{N} X_i(t)$. With the help of equations (3.1) and (3.2),
the newly proposed estimator of the covariance kernel is

\[ \hat c_u(t,s) = \frac{1}{N} \left[ \sum_{i=1}^{k} \left( X_i(t) - \hat\mu_k(t) \right) \left( X_i(s) - \hat\mu_k(s) \right) + \sum_{i=k+1}^{N} \left( X_i(t) - \tilde\mu_k(t) \right) \left( X_i(s) - \tilde\mu_k(s) \right) \right]. \tag{3.3} \]

For $u = 1$, we define
$\hat c_1(t,s) = \frac{1}{N} \sum_{i=1}^{N} \left( X_i(t) - \hat\mu_N(t) \right) \left( X_i(s) - \hat\mu_N(s) \right)$,
which is commonly used as an estimator of the covariance kernel; see, for example,
Berkes et al. (2009) and Aue et al. (2009). With the newly proposed estimator of the covariance
kernel we obtain the most important finding of this paper, which is stated in the following
theorem.
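Theorem 3.1. Defining $c_u(t,s) := c(t,s) + \theta(1-\theta)\Delta(t)\Delta(s) f_\theta(u)$, under
assumption A4,

\[ \int\!\!\int \left[ \hat c_u(t,s) - c_u(t,s) \right]^2 dt\,ds \xrightarrow{P} 0, \quad \text{as } N \to \infty, \]

where

\[ f_\theta(u) = \frac{\max\{u,\theta\} - \min\{u,\theta\}}{\max\{u,\theta\}\left(1 - \min\{u,\theta\}\right)} \in [0,1], \]

with $\theta \in (0,1)$, $u \in (0,1]$ and $\Delta(t) = \mu_1(t) - \mu_2(t)$.

Proof: The proof of the theorem is provided in the Appendix (Section 7). □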
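
As an illustration of the estimator (3.3), the following minimal sketch evaluates it for curves observed on a common grid, so that the kernel becomes a $T \times T$ matrix. The function name `split_covariance`, the array layout and the convention that `k == N` returns the pooled estimator are assumptions made here for illustration and are not taken from the paper.

```python
import numpy as np

def split_covariance(X, k):
    """Split-sample covariance estimate on a grid (sketch of estimator (3.3)).

    X : array of shape (N, T); row i holds curve X_i evaluated at T grid points.
    k : split index; rows 0..k-1 form the first segment, rows k..N-1 the second.
        Passing k == N gives the usual pooled estimator (the case u = 1).
    """
    X = np.asarray(X, dtype=float)
    N = X.shape[0]
    if not 1 <= k <= N:
        raise ValueError("k must satisfy 1 <= k <= N")
    first = X[:k] - X[:k].mean(axis=0)        # centre by the mean of segment 1
    C = first.T @ first
    if k < N:
        second = X[k:] - X[k:].mean(axis=0)   # centre by the mean of segment 2
        C = C + second.T @ second
    return C / N                              # (T, T) matrix approximating c_u(t, s)
```

Centring each segment by its own mean is what removes the contribution of the mean shift when the split falls at the true change point.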

Corollary 3.2. If the null hypothesis is true then $\hat c_u(t,s) \xrightarrow{P} c(t,s)$ for all
$u \in (0,1]$.

Some more interesting observations, which show the greater applicability of Theorem 3.1,
are immediate from it.

Remark 3.3. It can be easily checked that $c_u(t,s)$ is a positive definite, symmetric function
satisfying

\[ \int\!\!\int c_u^2(t,s)\,dt\,ds < \infty, \]

and hence is a covariance kernel.

Remark 3.4. If $u = 1$, that is, if the commonly used estimator of $c(t,s)$ is used, then it is
readily observable that, under the alternative,
$\hat c_1(t,s) \xrightarrow{P} c(t,s) + \theta(1-\theta)\Delta(t)\Delta(s) = \tilde c(t,s)$, say, which
is also proved by Berkes et al. (2009). We note here that whenever $H_0$ is false, $\hat c_1(t,s)$
has a constant asymptotic bias $\theta(1-\theta)\Delta(t)\Delta(s)$. Therefore, for any $u \in (0,1)$,
the asymptotic bias of the estimator $\hat c_u(t,s)$ is less than that of $\hat c_1(t,s)$ under
the alternative hypothesis.

Remark 3.5. If $u = \theta$, that is, when the data are partitioned at the true position, then
$\hat c_\theta(t,s) \xrightarrow{P} c(t,s)$, and in that case the asymptotic bias of
$\hat c_\theta(t,s)$ is zero whereas the asymptotic bias of $\hat c_1(t,s)$ remains
$\theta(1-\theta)\Delta(t)\Delta(s)$.
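
The size of the bias reduction described in Remarks 3.4 and 3.5 can be read off the factor $f_\theta(u)$ of Theorem 3.1. The short sketch below evaluates it; the function name `bias_factor` is hypothetical and introduced only for this illustration.

```python
def bias_factor(u, theta):
    """Evaluate f_theta(u) from Theorem 3.1 for u in (0, 1] and theta in (0, 1)."""
    if not (0.0 < u <= 1.0 and 0.0 < theta < 1.0):
        raise ValueError("need 0 < u <= 1 and 0 < theta < 1")
    hi, lo = max(u, theta), min(u, theta)
    return (hi - lo) / (hi * (1.0 - lo))
```

For example, `bias_factor(1.0, 0.5)` equals 1, which is the full bias of the pooled estimator, while `bias_factor(0.5, 0.5)` equals 0, the case of a split at the true change point.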


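A few more notations and definitions need to be introduced here to state the further
results.

Definition 3.6. The orthonormal functions $\omega_u^l(t)$ in $L^2(\tau)$ corresponding to real
scalars $\gamma_u^l$ are defined as orthonormal eigenfunctions corresponding to eigenvalues
$\gamma_u^l$ of the covariance operator $C_u(\cdot)$ from $L^2(\tau)$ to $L^2(\tau)$, defined as
$C_u(x)(t) = \int c_u(t,s) x(s)\,ds$, satisfying the relation

\[ \int c_u(t,s)\, \omega_u^l(s)\,ds = \gamma_u^l\, \omega_u^l(t). \tag{3.4} \]

Definition 3.7. The estimates of the eigenvalues $\gamma_u^l$ and eigenfunctions $\omega_u^l$ are
denoted by $\hat\lambda_u^l$ and $\hat v_u^l$, satisfying the relation

\[ \int \hat c_u(t,s)\, \hat v_u^l(s)\,ds = \hat\lambda_u^l\, \hat v_u^l(t). \tag{3.5} \]

With the above two definitions the following important observations can be noted.

Corollary 3.8. Under the assumption A4, for every $1 \le l \le d$ and $u \in (0,1]$, we have

\[ \hat\lambda_u^l - \gamma_u^l \xrightarrow{P} 0, \tag{3.6} \]

\[ \int \left[ \hat v_u^l(t) - \hat s_u^l\, \omega_u^l(t) \right]^2 dt \xrightarrow{P} 0, \tag{3.7} \]

where $\hat s_u^l = \mathrm{sgn}\langle \omega_u^l, \hat v_u^l \rangle$.

Proof: The proof follows from Theorem 3.1 and Lemmas 4.2 and 4.3 of ?. □

Remark 3.9. Under $H_0$, for all $1 \le l \le d$ and $u \in (0,1]$, $\hat\lambda_u^l \xrightarrow{P} \lambda^l$
and $\hat v_u^l$ converges to $v^l$ in probability, in $L^2(\tau)$. Moreover, under the alternative
hypothesis, if $u = \theta$ then for all $1 \le l \le d$, $\hat\lambda_\theta^l \xrightarrow{P} \lambda^l$
and $\hat v_\theta^l$ converges to $v^l$ in probability, in $L^2(\tau)$. It can, in fact, be easily
seen that

\[ \sup_{0 < u \le 1} \int \left[ \hat v_u^l(t) - \hat s_u^l\, \omega_u^l(t) \right]^2 dt \xrightarrow{P} 0. \]

In the direction of the eigenfunctions $\hat v_u^l$ corresponding to the largest $d$ eigenvalues
$\hat\lambda_u^l$ the noncentral scores can be obtained as

\[ \hat\eta_i^l(u) = \int X_i(t)\, \hat v_u^l(t)\,dt, \quad i = 1, \ldots, N, \; l = 1, \ldots, d. \tag{3.8} \]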

Utilizing the scores defined above, we provide a statistic and its distributional
convergence in the following theorem, which is needed to construct the test
statistic and perform the asymptotic test. First we define the statistic based on the self
normalized partial sums in $d$ dimensions,

\[ R_N(u) = \frac{1}{N} \sum_{l=1}^{d} \frac{1}{\hat\lambda_u^l} \left( \sum_{i=1}^{[Nu]} \hat\eta_i^l(u) - u \sum_{i=1}^{N} \hat\eta_i^l(u) \right)^2, \quad 0 \le u \le 1. \tag{3.9} \]

Further, letting $B_1(\cdot), \ldots, B_d(\cdot)$ denote independent standard Brownian bridges, the
theorem is as follows.

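Theorem 3.10. Let the assumptions A1 to A3 hold. Then, with $D[0,1]$ equipped with the
Skorohod topology, under $H_0$,

\[ R_N(u) \xrightarrow{d} \sum_{l=1}^{d} B_l^2(u), \quad 0 \le u \le 1. \tag{3.10} \]

Proof: The proof of the theorem is given in the Appendix (Section 7). □

Finally we define the test statistic as follows:

\[ H_{N,d} = \frac{1}{N^2} \sum_{l=1}^{d} \sum_{[Nu]=1}^{N} \frac{1}{\hat\lambda_u^l} \left( \sum_{i=1}^{[Nu]} \hat\eta_i^l(u) - u \sum_{i=1}^{N} \hat\eta_i^l(u) \right)^2. \tag{3.11} \]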

Using Theorem 3.10 it is immediate to see that
$H_{N,d} \xrightarrow{d} \int_0^1 \sum_{l=1}^{d} B_l^2(u)\,du$ under $H_0$, because the integral is a
continuous functional and
$U(R_N(\cdot)) \xrightarrow{d} U\left( \sum_{l=1}^{d} B_l^2(\cdot) \right)$ for any continuous
functional $U : D[0,1] \to \mathbb{R}$ (see Berkes et al. (2009) for further details). The
distribution of the limiting random variable can be found in Kiefer (1959) and its
$(1-\alpha)$th quantiles are given in Table 1 of Berkes et al. (2009). We use these asymptotic
critical values for performing the tests, and $H_0$ is rejected at the $100(1-\alpha)\%$
confidence level if the observed value of $H_{N,d}$ is bigger than the tabulated
$(1-\alpha)$th quantile $K_d(\alpha)$ in Berkes et al. (2009).
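
The computation behind the test can be sketched for curves observed on an equispaced grid of $[0,1]$. The sketch assumes that the statistic averages, over the split points $k = 1, \ldots, N$, the eigenvalue-normalised squared partial-sum deviations of the scores, with the covariance kernel re-estimated at each split; integrals are replaced by Riemann sums with weight `h`. The function name `change_statistics` and the use of the pooled kernel at $k = 1$, $N-1$ and $N$ are choices made here, not the authors' implementation.

```python
import numpy as np

def change_statistics(X, d, h=None):
    """Return (H, R): H approximates H_{N,d}; R[k-1] approximates R_N(k/N).

    X : (N, T) array of curves on an equispaced grid of [0, 1].
    d : number of leading eigenfunctions used (must not exceed the kernel rank).
    h : quadrature weight of the grid (assumed 1/T when omitted).
    """
    X = np.asarray(X, dtype=float)
    N, T = X.shape
    h = 1.0 / T if h is None else h
    R = np.zeros(N)
    for k in range(1, N + 1):
        pooled = k in (1, N - 1, N)              # pooled kernel at the edges
        parts = [X] if pooled else [X[:k], X[k:]]
        C = sum((P - P.mean(axis=0)).T @ (P - P.mean(axis=0)) for P in parts) / N
        lam, vec = np.linalg.eigh(C * h)         # eigenvalues in ascending order
        lam, vec = lam[::-1][:d], vec[:, ::-1][:, :d]
        scores = np.sqrt(h) * X @ vec            # Riemann-sum version of the scores
        partial = scores[:k].sum(axis=0) - (k / N) * scores.sum(axis=0)
        R[k - 1] = np.sum(partial ** 2 / lam) / N
    return R.mean(), R
```

The returned `H` would then be compared with the tabulated critical value $K_d(\alpha)$; the sign ambiguity of the eigenvectors is irrelevant because only squared partial sums enter.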


Now we show that the proposed test is consistent under the alternative hypothesis. Basically
we show here that $H_{N,d} \xrightarrow{P} \infty$ under the hypothesis of a single change point.
The following theorem assures the claim.

Theorem 3.11. Under the assumption A4,

\[ \frac{1}{N} H_{N,d} \xrightarrow{P} \sum_{l=1}^{d} \int_0^1 \frac{g_l^2(u)}{\gamma_u^l}\,du, \]

where $g_l(u) = \min\{\theta, u\} \left( 1 - \max\{\theta, u\} \right) \int \Delta(t)\, \omega_u^l(t)\,dt$.

Proof: The proof follows from Theorem 3.1 and the following lemma. □

Lemma 3.12. Under the assumption A4,

\[ \sup_{0 < u \le 1} \left| N^{-1} R_N(u) - \sum_{l=1}^{d} \frac{g_l^2(u)}{\gamma_u^l} \right| = o_P(1). \]

Proof: The proof of the lemma follows from the proof of Theorem 2 of Berkes et al. (2009). □


Clearly from Theorem 3.11, if $\int_0^1 g_l^2(u)\,du > 0$ for some $1 \le l \le d$, then
$H_{N,d} \xrightarrow{P} \infty$. Similar to Berkes et al. (2009), the change point $\theta$ is
estimated by finding the value of $u$ which maximizes the function $R_N(u)$. For uniqueness we
define the estimator formally as

\[ \hat\theta_N = \inf\left\{ u' : R_N(u') = \sup_{0 < u \le 1} R_N(u) \right\}. \tag{3.12} \]

It can be easily shown (using Lemma 3.12) that, under the assumption A4,
$\hat\theta_N \xrightarrow{P} \theta$ provided $\langle \Delta, \omega_u^l \rangle \neq 0$ for all
$u \in (0,1]$ (see, for example, Proposition 1 and its proof in Berkes et al. (2009)).
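
On a sample, $R_N$ is only evaluated at $u = k/N$, so the infimum in (3.12) reduces to taking the first maximiser. A minimal sketch, assuming `R` is an array holding $R_N(k/N)$ for $k = 1, \ldots, N$ (the function name is hypothetical):

```python
import numpy as np

def estimate_change_fraction(R):
    """Return (theta_hat, k_hat) from values R[k-1] = R_N(k/N), k = 1..N."""
    R = np.asarray(R, dtype=float)
    k_hat = int(np.argmax(R)) + 1     # np.argmax returns the first maximiser
    return k_hat / R.size, k_hat
```

Because `np.argmax` returns the first index at which the maximum is attained, ties are resolved towards the smallest $u$, in line with the infimum in the definition.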


4. Simulation studies 

In this section we report a summary of the extensive simulation studies that we have
conducted for moderate and large sample sizes. As proposed in Section 3, we reject the
null hypothesis when the observed value of $H_{N,d}$ exceeds the corresponding critical value
$K_d(\alpha)$. The critical values are available in Berkes et al. (2009, Table 1). Without loss
of generality the initial mean function is taken to be zero. For the first set of simulation
studies the samples are generated from standard Brownian motion (BM) over the interval
$[0,1]$, and drifts $t$ and $\sin(t)$ are added after the presumed location of the change
point. The same is done for the standard Brownian bridge (BB) over $[0,1]$, where the mean shift
after the change point is the quadratic function $0.8t(1-t)$. To generate a
sample from each of these Gaussian processes 1000 equidistant grid points are used. 750
B-spline basis functions are used to convert the grid data to functional data, and the first
3 (= d) eigenfunctions are used to execute the testing procedures. For a pre-decided sample
size and a specific change point the entire process is replicated 10000 times to assess the
power of the test. The sample sizes (N) considered are 50, 100, 150, 200, 300, 500. For any
particular sample size different possible locations of the change point ($k^*$) are chosen to
cover a wide range; the results are summarized in Table 1, Figure 1 and Figure 2. For all
practical purposes, we use the complete data together for computing the estimated covariance
kernel when $[Nu] = 1$ or $[Nu] = N - 1$, and otherwise the estimator proposed in equation (3.3).
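
The data-generating step of such a simulation can be sketched as follows. The sketch draws Brownian motion paths on an equispaced grid and adds a mean function after the change point; it omits the B-spline smoothing step, and the function name, arguments and seeding are assumptions made for illustration.

```python
import numpy as np

def simulate_bm_with_shift(N, k_star, shift, T=1000, seed=None):
    """Simulate N Brownian motion paths on [0, 1] with a mean change after k_star.

    shift : callable giving the added mean function on the grid, e.g. np.sin.
    Returns (t, X) with t of shape (T,) and X of shape (N, T).
    """
    rng = np.random.default_rng(seed)
    t = np.arange(1, T + 1) / T
    increments = rng.normal(scale=np.sqrt(1.0 / T), size=(N, T))
    X = np.cumsum(increments, axis=1)     # standard BM at t = 1/T, ..., 1
    X[k_star:] += shift(t)                # curves k_star + 1, ..., N get the new mean
    return t, X
```

A call such as `simulate_bm_with_shift(100, 35, np.sin, seed=1)` would mimic one replicate of the "BM, BM+sin(t)" setting with $k^* = 35$.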


4.1. Small sample bias correction: For small sample sizes (less than or equal to 100, say)
we observe some fluctuations in the empirical size of the proposed test based on $H_{N,d}$. To
overcome this instability we propose a bias correction which helps us to get an empirical size
reasonably close to 0.05. Under the null, it is easy to observe that

\[ E[\hat c_u(t,s)] = \left( 1 - \frac{2}{N} \right) c(t,s). \tag{4.1} \]

So we suggest multiplying $\hat c_u(t,s)$ by the correction factor $(1 - 2/N)^{-1}$ to obtain
satisfactory results. For large samples the effect of the correction factor vanishes
automatically and it hardly matters whether we use it or not.
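
The identity (4.1) follows from the usual degrees-of-freedom count within each segment. A short derivation, assuming A1, the null hypothesis and a split with $1 \le k \le N-1$:

```latex
\begin{align*}
E\Big[\sum_{i=1}^{k}\{X_i(t)-\hat\mu_k(t)\}\{X_i(s)-\hat\mu_k(s)\}\Big] &= (k-1)\,c(t,s),\\
E\Big[\sum_{i=k+1}^{N}\{X_i(t)-\tilde\mu_k(t)\}\{X_i(s)-\tilde\mu_k(s)\}\Big] &= (N-k-1)\,c(t,s),\\
E[\hat c_u(t,s)] &= \frac{(k-1)+(N-k-1)}{N}\,c(t,s) = \Big(1-\frac{2}{N}\Big)c(t,s).
\end{align*}
```

One degree of freedom is lost in each of the two segments, which is why the factor does not depend on $u$.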


4.2. Simulation findings: In all of the cases we find that the power curve of the proposed
test based on $H_{N,d}$ strictly dominates that of the test proposed by Berkes et al. (2009).
For large samples (200 and above, say) the two power curves get very close to each other, but
for small samples we observe a remarkable gap between the two. In particular, we provide the
details of power for N = 100 and d = 3 at change points ranging from 15
to 85 for Brownian motion and Brownian bridge in Table 1. We add two different functions,
namely $t$ and $\sin t$, to the mean of Brownian motion and add $0.8t(1-t)$ to the mean of the
standard Brownian bridge. In all of the above cases it is found that the proposed method
has more power than the method of Berkes et al. (2009) for all locations of the
change point. Figure 1 and Figure 2 show the powers of the two methods for sample
size 50 (= N) at different change points, where the data have been simulated from standard
Brownian motion and the two functions $t$ and $\sin t$ are added separately to its mean
at different locations of change for illustration purposes. It can be clearly observed that if
the sample size is small then our method outperforms the method of Berkes et al. (2009)
by a much larger margin. We have also done simulations with different sample sizes and
a variety of functions, e.g. $\sqrt{t}$, $\exp(t)$, $\cos(t)$, etc., added to the mean of Brownian
motion and Brownian bridge, and in all cases we have found that our method has better
power than the existing method. This finding is quite intuitive because both tests are
asymptotic tests (both converging to the same asymptotic distribution) and the proposed
one always has higher power than that of Berkes et al. (2009), mainly because the bias in
the newly proposed estimate of the covariance kernel under the alternative is smaller than
that in the usual estimate of the covariance kernel used elsewhere. This satisfies the
desirable quality of a better asymptotic test. We also observe quite good performance of the
test statistic when the location of the change point is $< N/4$ or $> 3N/4$.


5. Real Data analysis 

The findings of the real data analysis, which demonstrate the performance of the proposed
test, are presented in this section. Two temperature data sets have been analyzed. The first
consists of average daily temperatures of central England for 228 years, from 1780 to 2007,
and has been taken from the website of the British Atmospheric Data Centre. The second,
taken from the Carbon Dioxide Information Analysis Center, consists of monthly global average
temperature anomalies from 1850 to 2012. Thus, these two data sets can be viewed
as 228 curves with 365 measurements on each curve and 163 curves with 12 measurements
on each curve, respectively. They are converted to functional data using 12
B-spline basis functions and 8 B-spline basis functions, respectively. Now we discuss the
performance of the test statistics on these two temperature data sets individually.

To use the proposed test statistic for the temperature data of central England we use the first
8 (= d) eigenfunctions, which explain about 85% of the total variability. Given that the test
indicates a change, the change point is estimated by calculating $\hat\theta_N$ as described in
(3.12). Thereafter the data set is divided into two parts and the procedure is repeated for
each part until the test fails to reject the null hypothesis. The outcome of our method on this
data set is provided in Table 2. It can be seen that the change points detected by our method
and by the method of Berkes et al. (2009) are very close. Both methods
have detected 1850 and 1926 as possible change points. For the other change points
the timings are very close: our method has detected a
change in 1810 whereas Berkes et al. (2009) have detected a change in 1808, and in recent
years our method has detected a change in 1989 whereas Berkes et al. (2009) have detected 1993
as a possible change point. Overall, it is important to note that both methods have detected
four change points in the given data. Table 2 also shows the p-values corresponding to the
observed values of the statistic for both methods. The p-values of the proposed test are
smaller than those of the existing method,
showing the greater power of our test. The mean functions for each segment of the partitioned
data are provided in Figure 3. The picture clearly shows that there is an upward trend in
the mean function from one period to the next.
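
The repeated splitting described above is a binary segmentation scheme, which can be sketched generically. The callable `test` is hypothetical: it stands for any single-change-point test that returns a rejection decision together with the estimated number of curves before the change inside the segment it is given. The minimum segment length `min_len` is also an assumption made for this sketch.

```python
def binary_segmentation(X, test, start=0, min_len=2):
    """Return the sorted list of detected change positions (global indices).

    X    : sequence of curves (any sliceable container).
    test : callable, test(segment) -> (reject, k_hat), with k_hat the number of
           curves before the estimated change inside `segment`.
    """
    n = len(X)
    if n < 2 * min_len:
        return []
    reject, k_hat = test(X)
    if not reject or k_hat < min_len or n - k_hat < min_len:
        return []                                   # stop: no (usable) change found
    left = binary_segmentation(X[:k_hat], test, start, min_len)
    right = binary_segmentation(X[k_hat:], test, start + k_hat, min_len)
    return left + [start + k_hat] + right
```

Each recursive call applies the same test to a shorter segment, so the recursion ends as soon as the test fails to reject or a segment becomes too short to split.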

For the monthly average anomaly of the global temperature data of 163 years, the first 3 (= d)
eigenfunctions are used, which explain about 96% of the total variation. We apply
the same procedure as in the case of the previous data set to detect the changes.
Table 3 shows the outcomes of the test. The functional data representation of the complete
data and the segment-wise mean functions are shown in Figure 4, which reflects the prominent
changes around the mentioned years. From the analysis of the second data
set we clearly observe that the global temperature is changing (more specifically, increasing)
significantly over time.


6. Discussions and conclusions 

In this paper we have proposed a new test for the existence of a change point in
a given sequence of independent functional data. It is shown that the null distribution of
the proposed test is asymptotically pivotal. We have proven that under the null hypothesis the
limiting distribution of the test statistic is a functional of the sum of squares of Brownian
bridges. Moreover, it has been established that under the alternative hypothesis of a single
change point the power of the proposed test goes to unity as the sample size increases to
infinity. While developing the test statistic we have proposed an alternative estimator of the
covariance kernel, which is not only a consistent estimator of the true covariance kernel under
the null hypothesis but also has less bias than the usual estimate of the covariance
kernel under the alternative hypothesis. In fact it is shown that even under the
alternative hypothesis, if the data are divided at the true point of change then our estimate
has zero asymptotic bias, whereas the existing estimate of the covariance kernel mostly used in
the change point literature for functional data has a constant asymptotic bias. Because
our estimate of the covariance kernel has a smaller asymptotic bias than the existing one under
the alternative, our test has greater power than the existing
one for testing the presence of a change point in a given sequence of functional data. The
extensive simulation studies also support this claim. Especially when the data size is not
very big our method outperforms the existing one by a great margin.

We have used our method on two real data sets to see the performance of our test in practice.
One of these is the central England temperature data, which is also used in Berkes et al.
(2009), and the other is the global temperature data. For the first data set, our
method and the method of Berkes et al. (2009) have both pointed to four changes in the data
sequence. Two time points, namely 1850 and 1926, match exactly for the two methods.
For the two other change points the methods differ marginally. Berkes et al. (2009) have
detected 1808 as a possible change point whereas our method detected 1810.
For the other one, Berkes et al. (2009) detected 1993 as a possible change point and our
method indicated 1989. We have plotted the mean function for
each of the different segments, which clearly shows an upward trend in the mean temperature
over the said periods. The mean curves of the different time segments are very similar to those
of Berkes et al. (2009), which indicates that the small observed differences in change points
between the two methods on this particular data set are not major. The second data set, the
global monthly temperature data from 1850 to 2012, is analyzed with our method. It is found
that there exist three change points, around 1933, 1986 and 1996. The analysis of global
temperature in terms of finding change points may help scientists working on global
temperature. It clearly shows that in the last three decades the temperature has increased
significantly compared with the past.

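To conclude, the proposed method has an asymptotically pivotal null distribution
and greater power than the existing method for testing the presence of a change in a sequence
of functional data, and hence can be used in practice with more confidence.

7. Appendix

Proof of Theorem 3.1. Define $\hat\mu_k(t) = \frac{1}{k} \sum_{i=1}^{k} X_i(t)$ and
$\tilde\mu_k(t) = \frac{1}{N-k} \sum_{i=k+1}^{N} X_i(t)$ for some $k = [Nu]$ and $k^* = [N\theta]$
to express the estimated covariance kernel as

\[ \hat c_u(t,s) = \frac{1}{N} \left[ \sum_{i=1}^{k} \{X_i(t) - \hat\mu_k(t)\}\{X_i(s) - \hat\mu_k(s)\} + \sum_{i=k+1}^{N} \{X_i(t) - \tilde\mu_k(t)\}\{X_i(s) - \tilde\mu_k(s)\} \right]. \]

It immediately gives

\[ \hat c_u(t,s) = \frac{1}{N} \sum_{i=1}^{N} \{X_i(t) - \hat\mu_N(t)\}\{X_i(s) - \hat\mu_N(s)\} - \frac{k}{N} \{\hat\mu_k(t) - \hat\mu_N(t)\}\{\hat\mu_k(s) - \hat\mu_N(s)\} - \frac{N-k}{N} \{\tilde\mu_k(t) - \hat\mu_N(t)\}\{\tilde\mu_k(s) - \hat\mu_N(s)\}. \]

For $k < k^*$, note that

\[ \hat\mu_k(t) = \bar Y_k(t) + \mu_1(t), \quad \text{where } \bar Y_k(t) = \frac{1}{k} \sum_{i=1}^{k} Y_i(t), \]

\[ \tilde\mu_k(t) = \tilde Y_k(t) + \mu_2(t) + \left( \frac{k^* - k}{N - k} \right) \Delta(t), \quad \text{where } \tilde Y_k(t) = \frac{1}{N-k} \sum_{i=k+1}^{N} Y_i(t), \]

and

\[ \hat\mu_N(t) = \bar Y_N(t) + \frac{k^*}{N} \mu_1(t) + \left( 1 - \frac{k^*}{N} \right) \mu_2(t). \]

Now observe that

\[ \hat\mu_k(t) - \hat\mu_N(t) = \bar Y_k(t) - \bar Y_N(t) + \left( 1 - \frac{k^*}{N} \right) \Delta(t) \]

and

\[ \tilde\mu_k(t) - \hat\mu_N(t) = \tilde Y_k(t) - \bar Y_N(t) - \frac{k}{N-k} \left( 1 - \frac{k^*}{N} \right) \Delta(t), \]

to get the following deduction:

\begin{align*}
\hat c_u(t,s) ={}& \frac{1}{N} \sum_{i=1}^{N} \{X_i(t) - \hat\mu_N(t)\}\{X_i(s) - \hat\mu_N(s)\} - \frac{k}{N-k} \left( \frac{N - k^*}{N} \right)^2 \Delta(t)\Delta(s) \\
& - \frac{k}{N} \{\bar Y_k(t) - \bar Y_N(t)\}\{\bar Y_k(s) - \bar Y_N(s)\} - \left( 1 - \frac{k}{N} \right) \{\tilde Y_k(t) - \bar Y_N(t)\}\{\tilde Y_k(s) - \bar Y_N(s)\} \\
& - \frac{k}{N} \left( 1 - \frac{k^*}{N} \right) \left[ \{\bar Y_k(t) - \tilde Y_k(t)\}\Delta(s) + \{\bar Y_k(s) - \tilde Y_k(s)\}\Delta(t) \right].
\end{align*}

Again,

\begin{align*}
\frac{1}{N} \sum_{i=1}^{N} \{X_i(t) - \hat\mu_N(t)\}\{X_i(s) - \hat\mu_N(s)\} ={}& \frac{1}{N} \sum_{i=1}^{N} \{Y_i(t) - \bar Y_N(t)\}\{Y_i(s) - \bar Y_N(s)\} + \frac{k^*}{N} \left( 1 - \frac{k^*}{N} \right) \Delta(t)\Delta(s) \\
& + \frac{k^*}{N} \left( 1 - \frac{k^*}{N} \right) \left[ \{\bar Y_{k^*}(t) - \tilde Y_{k^*}(t)\}\Delta(s) + \{\bar Y_{k^*}(s) - \tilde Y_{k^*}(s)\}\Delta(t) \right]
\end{align*}

gives

\begin{align*}
\hat c_u(t,s) ={}& \frac{1}{N} \sum_{i=1}^{N} \{Y_i(t) - \bar Y_N(t)\}\{Y_i(s) - \bar Y_N(s)\} + \left[ \frac{k^*}{N} \left( 1 - \frac{k^*}{N} \right) - \frac{k}{N-k} \left( 1 - \frac{k^*}{N} \right)^2 \right] \Delta(t)\Delta(s) \\
& + \left( 1 - \frac{k^*}{N} \right) \left[ \frac{k^*}{N} \{\bar Y_{k^*}(t) - \tilde Y_{k^*}(t)\} - \frac{k}{N} \{\bar Y_k(t) - \tilde Y_k(t)\} \right] \Delta(s) \\
& + \left( 1 - \frac{k^*}{N} \right) \left[ \frac{k^*}{N} \{\bar Y_{k^*}(s) - \tilde Y_{k^*}(s)\} - \frac{k}{N} \{\bar Y_k(s) - \tilde Y_k(s)\} \right] \Delta(t) \\
& - \frac{k}{N} \{\bar Y_k(t) - \bar Y_N(t)\}\{\bar Y_k(s) - \bar Y_N(s)\} - \left( 1 - \frac{k}{N} \right) \{\tilde Y_k(t) - \bar Y_N(t)\}\{\tilde Y_k(s) - \bar Y_N(s)\} \\
={}& \frac{1}{N} \sum_{i=1}^{N} \{Y_i(t) - \bar Y_N(t)\}\{Y_i(s) - \bar Y_N(s)\} + \theta(1-\theta)\Delta(t)\Delta(s) f_\theta(u) \\
& + r_1(t,s) + r_2(t,s) + r_3(t,s) + o(1), \text{ say.} \tag{7.1}
\end{align*}

Here $r_1(t,s)$ and $r_2(t,s)$ denote the terms multiplying $\Delta(s)$ and $\Delta(t)$,
respectively, $r_3(t,s)$ denotes the sum of the last two (quadratic) terms, and the deterministic
$o(1)$ term arises from replacing $k/N$ and $k^*/N$ by $u$ and $\theta$ in the coefficient of
$\Delta(t)\Delta(s)$, which converges to $\theta(1-\theta)(\theta - u)/\{\theta(1-u)\} = \theta(1-\theta) f_\theta(u)$
for $u < \theta$.

Using the law of large numbers for independent, identically distributed Hilbert-space-valued
random variables (see for example Theorem 2.4 of ?), we obtain

\[ \int\!\!\int r_1^2(t,s)\,dt\,ds \xrightarrow{P} 0 \quad \text{and} \quad \int\!\!\int r_2^2(t,s)\,dt\,ds \xrightarrow{P} 0 \quad \text{as } N \to \infty. \]

At the same time, using Theorem 5.1 of Horvath and Kokoszka (2012) we get

\[ N^2 \int\!\!\int r_3^2(t,s)\,dt\,ds \xrightarrow{d} \left( \int \Gamma^2(t)\,dt \right)^2, \]

where $\{\Gamma(t) : t \in \tau\}$ is a Gaussian process with $E(\Gamma(t)) = 0$ and
$E(\Gamma(t)\Gamma(s)) = c(t,s)$, which in turn implies that

\[ \int\!\!\int r_3^2(t,s)\,dt\,ds \xrightarrow{P} 0 \quad \text{as } N \to \infty. \]

Since, by the same law of large numbers, the first term on the right-hand side of (7.1)
converges to $c(t,s)$ in $L^2(\tau \times \tau)$ in probability, these help to conclude that

\[ \int\!\!\int \left[ \hat c_u(t,s) - c_u(t,s) \right]^2 dt\,ds \xrightarrow{P} 0 \quad \text{as } N \to \infty. \]

A similar proof holds when $k > k^*$. It is easy to see that under the null hypothesis

\[ \hat c_u(t,s) \xrightarrow{P} c(t,s) \quad \forall u \in (0,1] \quad \text{as } N \to \infty. \]

□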

Proof of Theorem 3.10.

The proof follows from Theorem 3.1, Corollary 3.8 and the proof of Theorem 6.1 of
Horvath and Kokoszka (2012).

Table 1. Power comparison of two tests with test statistics S_{N,d} and H_{N,d} for different k* (N = 100, d = 3)

          BM, BM+t             BM, BM+sin(t)        BB, BB+0.8(1-t)t
  k*      S_{N,d}   H_{N,d}    S_{N,d}   H_{N,d}    S_{N,d}   H_{N,d}
   0       4.6*      5.5        4.6*      5.5        4.6*      5.0
  15      36.3      39.9       30.9      35.4       11.2      12.9
  20      57.3      62.0       44.6      49.5       15.3      16.5
  25      72.0      75.6       61.2      64.7       19.5      21.3
  35      92.9      94.2       80.1      83.4       28.0      31.8
  50      94.9*     95.8       88.0*     90.1       34.7      37.4
  65      91.0      92.9       81.5      83.7       31.1      33.9
  75      74.3      78.1       59.0      64.4       21.9      23.8
  80      58.8      64.3       46.1      50.2       13.7      16.1
  85      36.4      40.1       27.8      33.0       12.9      14.1

* The values are reported from the tables provided by Berkes et al. (2009, Table 3).


Table 2. Comparisons of the performance of S_{N,d} and H_{N,d} for UK temperature data

Performance of S_{N,d} *

  Year segment    Observed S_{N,d}    Obtained P-value    Estimated change point
  1780-2007       8.020593            0.00000             1926
  1780-1925       3.252796            0.00088             1808
  1808-1925       2.351132            0.02322             1850
  1926-2007       2.311151            0.02643             1993

Performance of H_{N,d}

  Year segment    Observed H_{N,d}    Obtained P-value    Estimated change point
  1780-2007       9.820036            0.00000             1926
  1780-1926       3.764348            0.00011             1850
  1780-1850       2.403308            0.01900             1810
  1927-2007       2.649414            0.00797             1989

* The values are reported from the tables provided by Berkes et al. (2009, Table 4).


Table 3. Change points for average anomaly global temperature data

Performance of H_{N,d}

  Year segment    Observed H_{N,d}    Obtained P-value    Estimated change point
  1850-2012       23.63304            0.00000             1933
  1934-2012       13.46585            0.00000             1986
  1987-2012        4.34103            0.00000             1996

[Figure: panel "BM, BM+t"]
Figure 1. Power comparison of H_{N,d} and S_{N,d} for N = 50 and d = 3 with Δ(t) = t.

[Figure: panel "BM, BM+sin(t)"; horizontal axis: change points]
Figure 2. Power comparison of H_{N,d} and S_{N,d} for N = 50 and d = 3 with Δ(t) = sin(t).

Figure 3. Segment wise mean functions of central England temperature data

Figure 4. Segment wise mean functions of average anomaly of global temperature data



















